
An AC-type element mediates transactivation of secondary cell wall carbohydrate-active enzymes by PttMYB021, the Populus MYB46 orthologue

Author: A Winzell
Anders Winzell
Camilla Johansson
D Hatton
Henrik Aspeborg
Ines Ezcurra
J Raes
J Zhou
K Ohashi-Ito
KD Hauffe
M Goicoechea
M Yamaguchi
Prashanth Tamizhselvan
R Zhong
S Fornalé
S Legay
Y Pilpel
Publication venue: BioMed Central
Publication date: 01/01/2011

Springer - Publisher Connector

Public Library of Science (PLOS)

Removal of AU Bias from Microarray mRNA Expression Data Enhances Computational Identification of Active MicroRNAs

Author: A Rodriguez
BP Lewis
C Cheadle
CZ Chen
E Segal
F Fazi
F van Ruissen
J Brennecke
J Tsang
JC Huang
KK Farh
L Bruno
LP Lim
M Reimers
MS Lee
N Felli
N Rajewsky
P Flicek
P Sood
QJ Li
R Elkon
RA Irizarry
Ran Elkon
Reuven Agami
RW Georgantas 3rd
S Landais
S Pradervand
T Liu
VN Kim
Y Pilpel
YH Yang
Yitzhak Pilpel
Publication venue: Public Library of Science
Publication date: 03/10/2008

Elucidation of regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges of present molecular and computational biology. The integrated analysis of gene expression data and 3′-UTR sequences holds great promise for being an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3′-UTR AU content and gene response in numerous microarray datasets. We show that this relationship is secondary to a general bias that links gene response and probe AU content and reflects the fact that in the majority of current arrays probes are selected from target transcript 3′-UTRs. Therefore, removal of this bias, which is in order in any analysis of microarray datasets, is of crucial importance when integrating expression data and 3′-UTR sequences to identify regulatory elements embedded in this region. We developed visualization and normalization schemes for the detection and removal of such AU biases and demonstrate that their application to microarray data significantly enhances the computational identification of active miRs. Our results substantiate that, after removal of AU biases, mRNA expression profiles contain ample information which allows in silico detection of miRs that are active in physiological conditions
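The bias removal described in this abstract can be illustrated with a minimal sketch. The abstract does not say which normalization scheme the authors use, so the linear fit below is an assumption chosen for brevity, and the names `au_fraction` and `remove_au_trend` are hypothetical. The idea is to estimate how gene response depends on probe AU content and subtract that trend before looking for miR signals.

```python
def au_fraction(seq):
    """Fraction of A/U (or A/T) bases in a probe or 3'-UTR sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


def remove_au_trend(au, response):
    """Subtract a least-squares linear trend of response on AU content.

    Assumption: a straight line is an adequate model of the bias.
    The corrected values keep the original mean response.
    Returns (corrected_values, fitted_slope).
    """
    n = len(au)
    mean_x = sum(au) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in au)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(au, response))
    slope = sxy / sxx if sxx else 0.0
    corrected = [y - slope * (x - mean_x) for x, y in zip(au, response)]
    return corrected, slope
```

A fitted slope far from zero would indicate an AU bias in the dataset. A smoother, nonlinear fit may suit real arrays better, but the abstract gives no detail on that point.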

Public Library of Science (PLOS)

Genome-Wide Survey for Biologically Functional Pseudogenes

Author: Graur D Shuali Y, Li
Jens Lagergren
Lars Arvestad
Orr HT Chung MY, Banfi S, Kwiatkowski TJ Jr, Servadio A, et al.
Waterston RH Lindblad-Toh K, Birney E, Rogers J, Abril JF, et al.
Yano Y Saito R, Yoshida N, Yoshiki A, Wynshaw-Boris A, et al.
Yitzhak Pilpel
Zheng D Zhang Z, Harrison P, Karro J, Carriero N, et al.
Örjan Svensson
Publication venue: Public Library of Science
Publication date: 01/01/2005

According to current estimates there exist about 20,000 pseudogenes in a mammalian genome. The vast majority of these are disabled and nonfunctional copies of protein-coding genes which, therefore, evolve neutrally. Recent findings that a Makorin1 pseudogene, residing on mouse Chromosome 5, is, indeed, in vivo vital and also evolutionarily preserved, encouraged us to conduct a genome-wide survey for other functional pseudogenes in human, mouse, and chimpanzee. We identify to our knowledge the first examples of conserved pseudogenes common to human and mouse, originating from one duplication predating the human–mouse species split and having evolved as pseudogenes since the species split. Functionality is one possible way to explain the apparently contradictory properties of such pseudogene pairs, i.e., high conservation and ancient origin. The hypothesis of functionality is tested by comparing expression evidence and synteny of the candidates with proper test sets. The tests suggest potential biological function. Our candidate set includes a small set of long-lived pseudogenes whose unknown potential function is retained since before the human–mouse species split, and also a larger group of primate-specific ones found from human–chimpanzee searches. Two processed sequences are notable, their conservation since the human–mouse split being as high as most protein-coding genes; one is derived from the protein Ataxin 7-like 3 (ATX7NL3), and one from the Spinocerebellar ataxia type 1 protein (ATX1). Our approach is comparative and can be applied to any pair of species. It is implemented by a semi-automated pipeline based on cross-species BLAST comparisons and maximum-likelihood phylogeny estimations. To separate pseudogenes from protein-coding genes, we use standard methods, utilizing in-frame disablements, as well as a probabilistic filter based on Ka/Ks ratios

CiteSeerX

Incorporating Existing Network Information into Gene Network Inference

Author: A Meissner
A Sharov
AA Margolin
AV Werhli
B Efron
BE Bernstein
C Jiang
Cathal Seoighe
D Gilbert
E Segal
ER Mardis
F Mordelet
H de Jong
H Li
H Zou
I Park
J Friedman
J Kim
J Yu
JJ Faith
K Knight
K Okita
K Takahashi
K Tan
M Bansal
M Gustafsson
M Stadtfeld
M Wernig
ME Donohoe
N Friedman
N Ivanova
N Tsubooka
O Banerjee
P Tseng
Q Zhou
Qing Nie
R Bonneau
R Tibshirani
RW Kennard
S Mukherjee
Scott Christley
TS Gardner
TS Mikkelsen
X Chen
X Zhang
Xiaohui Xie
XY Li
Y Chen
Y Pilpel
Y Tamada
Y Wang
Publication venue: Public Library of Science
Publication date: 01/01/2009

One methodology that has successfully inferred gene networks from gene expression data is based on ordinary differential equations (ODEs). However, new types of data continue to be produced, so it is worthwhile to investigate how to integrate these new data types into the inference procedure. One such data type is physical interactions between transcription factors and the genes they regulate, as measured by ChIP-chip or ChIP-seq experiments. These interactions can be incorporated into the gene network inference procedure as a priori network information. In this article, we extend the ODE methodology into a general optimization framework that incorporates existing network information in combination with regularization parameters that encourage network sparsity. We provide theoretical results proving convergence of the estimator for our method and show that the corresponding probabilistic interpretation also converges. We demonstrate our method on simulated network data and show that using existing network information improves performance, compensates for the lack of observations, and remains effective even when some of the existing network information is incorrect. We further apply our method to the core regulatory network of embryonic stem cells, utilizing predicted interactions from two studies as existing network information. We show that including the prior network information yields a more closely representative regulatory network than when no information is provided.
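One way to picture combining prior network information with a sparsity-encouraging penalty is a weighted L1 regression for a single target gene. This is a sketch under stated assumptions, not the paper's formulation: the linear model, the coordinate-descent solver, and the names `weighted_lasso` and `penalties_from_prior` are all illustrative. Each row of `X` holds regulator expression levels at one time point, `y` holds the target gene's estimated rate of change, and edges supported by prior data receive a smaller penalty.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def penalties_from_prior(prior_edges, n_features, base=1.0, discount=0.1):
    """Per-regulator penalties; regulators in prior_edges are penalized less.

    base and discount are hypothetical tuning values.
    """
    return [base * discount if j in prior_edges else base
            for j in range(n_features)]


def weighted_lasso(X, y, penalties, sweeps=100):
    """Minimize 0.5 * ||y - Xw||^2 + sum_j penalties[j] * |w[j]|
    by cyclic coordinate descent."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    for _ in range(sweeps):
        for j in range(n_features):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n_samples):
                pred_without_j = sum(X[i][k] * w[k]
                                     for k in range(n_features) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho, penalties[j]) / z if z else 0.0
    return w
```

With a uniform penalty, a weak edge is shrunk to zero. Lowering the penalty for an edge that appears in the prior lets the same weak signal survive, which is the intuition behind treating ChIP-derived interactions as existing network information.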

CiteSeerX

eScholarship - University of California

A classification-based framework for predicting and analyzing gene regulatory response

Author: AJ Hartemink
Anshul Kundaje
AP Gasch
Chris H Wiggins
Christina Leslie
CI Holmberg
D Pe'er
D Pollard
DC Raitt
E Ramil
E Segal
ER Gansner
HJ Bussemaker
I Ota
I Pedruzzi
J Ihmels
JD Hughes
JT Lin
M Middendorf
MA Beer
Manuel Middendorf
Mihir Shah
P Zarzov
RE Schapire
TI Lee
VK Vyas
W Hoeffding
Y Pilpel
Yoav Freund
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem — predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. 
Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from

Springer - Publisher Connector

Columbia University Academic Commons

Prioritization of gene regulatory interactions from large-scale modules in yeast

Author: A Tanay
A-L Barabasi
CT Harbison
D Greenbaum
E Schweizer
E Segal
F Gao
F Rolland
G Lesage
Ho-Joon Lee
I Simon
J Ihmels
K Lemmens
LH Hartwell
LL Newcomb
M Kellis
M Koranda
Martin Vingron
N Zhang
P Cliften
P Prochasson
PT Spellman
R Siddharthan
Ricardo Bringas
S Rahmann
S Tavazoie
SW Doniger
T Manke
T Yu
Thomas Manke
TI Lee
V Matys
VR Iyer
W-S Wu
X Xu
Y Pilpel
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: The identification of groups of co-regulated genes and their transcription factors, called transcriptional modules, has been a focus of many studies of biological systems. While methods have been developed to derive numerous modules from genome-wide data, individual links between regulatory proteins and target genes still need experimental verification. In this work, we aim to prioritize regulator-target links within transcriptional modules based on three types of large-scale data sources. RESULTS: Starting with putative transcriptional modules from ChIP-chip data, we first derive modules in which target genes show both expression and functional coherence. The most reliable regulatory links between transcription factors and target genes are established by identifying the intersection of target genes in coherent modules for each enriched functional category. Using a combination of genome-wide yeast data in normal growth conditions and two different reference datasets, we show that our method predicts regulatory interactions with significantly higher predictive power than ChIP-chip binding data alone. A comparison with results from other studies highlights that our approach provides a reliable and complementary set of regulatory interactions. Based on our results, we can also identify functionally interacting target genes, for instance, a group of co-regulated proteins related to cell wall synthesis. Furthermore, we report novel conserved binding sites of a glycoprotein-encoding gene, CIS3, regulated by Swi6-Swi4 and Ndd1-Fkh2-Mcm1 complexes. CONCLUSION: We provide a simple method to prioritize individual TF-gene interactions from large-scale transcriptional modules. In comparison with other published works, we predict a complementary set of regulatory interactions which yields a similar or higher prediction accuracy at the expense of sensitivity. Therefore, our method can serve as an alternative approach to prioritization for further experimental studies.

Springer - Publisher Connector

Systems-wide analysis of manganese deficiency-induced changes in gene activity of Arabidopsis roots

Author: A De Angeli
A Ihnatowicz
A Mortazavi
A Schneider
AL Socha
C Huang
C Nouet
C Vogel
CA Hebbern
CM Palmer
G Glauser
H Yi
IC Pan
J Cox
J Rodríguez-Celma
K Buxdorf
K Schlaeppi
KJ Livak
M Müller
MA Estelle
MD Allen
MJ Milner
NB Schmid
NK Clay
P Fourcroy
P Lan
P Pedas
R Cailliatte
RD Abreu
S Husted
SB Schmidt
TG Andersen
TJ Yang
V Lanquar
WD Lin
WJ Kent
Y Pilpel
YO Korshunova
Z Ma
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Manganese (Mn) is pivotal for plant growth and development, but little information is available regarding the strategies that evolved to improve Mn acquisition and cellular homeostasis of Mn. Using an integrated RNA-based transcriptomic and high-throughput shotgun proteomics approach, we generated a comprehensive inventory of transcripts and proteins that showed altered abundance in response to Mn deficiency in roots of the model plant Arabidopsis. A suite of 22,385 transcripts was consistently detected in three RNA-seq runs; LC-MS/MS-based iTRAQ proteomics allowed the unambiguous determination of 11,606 proteins. While high concordance between mRNA and protein expression (R = 0.87) was observed for transcript/protein pairs in which both gene products accumulated differentially upon Mn deficiency, only approximately 10% of the total alterations in the abundance of proteins could be attributed to transcription, indicating a large impact of protein-level regulation. Differentially expressed genes spanned a wide range of biological functions, including the maturation, translation, and transport of mRNAs, as well as primary and secondary metabolic processes. Metabolic analysis by UPLC-qTOF-MS revealed that the steady-state levels of several major glucosinolates were significantly altered upon Mn deficiency in both roots and leaves, possibly as a compensation for increased pathogen susceptibility under conditions of Mn deficiency

HAL Descartes

University of East Anglia digital repository

ProdInra

A stochastic differential equation model for transcriptional regulatory networks

Author: Adriana Climescu-Haulica
CT Harbison
CW Garvie
G Casella
H Akaike
HC Chen
KC Chen
M Kaern
Michelle D Quirk
PT Spellman
RJ Cho
S Chu
S Weisberg
TI Lee
Y Pilpel
Publication venue: Springer Science and Business Media LLC
Publication date

Public Library of Science (PLOS)

Comprehensive Network Analysis of Anther-Expressed Genes in Rice by the Combination of 33 Laser Microdissection and 143 Spatiotemporal Microarrays

Author: A Oikawa
Abidur Rahman
AM Koltunow
AM Sorensen
B Usadel
C de Azevedo Souza
D Swarbreck
D Zhang
E Grienenberger
F Ahlers
FF Fu
G Suzuki
Go Suzuki
H Bubert
H Li
H Masuko
Hirokazu Takahashi
HK Lee
IW Manfield
J Ihmels
J Kaur
J Rozema
J Zhu
J-S Jeon
JD Higgins
K Aoki
K Aya
K Hamada
K Suwabe
K Vandepoele
K Yamada
K-I Nonomura
Katsuhiro Shiono
Keita Suwabe
Kentaro Yano
KH Jung
Koichiro Aya
L Chang
L Zhang
M Amagai
M Endo
M Mutwil
M Shinohara
M Wang
Makoto Matsuoka
Masao Watanabe
MG Aarts
Mikio Nakazono
MY Hirai
Nobuhiro Tsutsumi
P Rubinelli
R Scott
S Persson
SP Ficklin
SS Kim
SS Sugano
T Ariizumi
T Hobo
T Obayashi
T Tsuchiya
TA Long
TH Lee
Tokunori Hobo
V Srinivasasainagendra
W Yuan
Y Hihara
Y Ogata
Y Okazaki
Y Pilpel
Y Sato
Y Watanabe
Yoshiaki Nagamura
ZB Zhang
ZY Deng
Publication venue: Public Library of Science
Publication date: 26/10/2011

Co-expression networks systematically constructed from large-scale transcriptome data reflect the interactions and functions of genes with similar expression patterns and are a powerful tool for the comprehensive understanding of biological events and mining of novel genes. In Arabidopsis (a model dicot plant), high-resolution co-expression networks have been constructed from very large microarray datasets and these are publicly available as online information resources. However, the available transcriptome data of rice (a model monocot plant) have been limited so far, making it difficult for rice researchers to achieve reliable co-expression analysis. In this study, we performed co-expression network analysis by using combined 44K Agilent microarray datasets of rice, which consisted of 33 laser microdissection (LM)-microarray datasets of anthers, and 143 spatiotemporal transcriptome datasets deposited in RicexPro. The complete rice co-expression network, which was generated from the 176 microarray datasets by the Pearson correlation coefficient (PCC) method with the mutual rank (MR)-based cut-off, contained 24,258 genes and 60,441 gene pairs. Using these datasets, we constructed high-resolution co-expression subnetworks of two specific biological events in the anther, “meiosis” and “pollen wall synthesis”. The meiosis network contained many known or putative meiotic genes, including genes related to meiosis initiation and recombination. In the pollen wall synthesis network, several candidate genes involved in the sporopollenin biosynthesis pathway were efficiently identified. Hence, these two subnetworks are important demonstrations of the efficiency of co-expression network analysis in rice. Our co-expression analysis included the separated transcriptomes of pollen and tapetum cells in the anther, which are able to provide precise information on transcriptional regulation during male gametophyte development in rice. The co-expression network data presented here is a useful resource for rice researchers to elucidate important and complex biological events.
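The PCC-with-mutual-rank construction mentioned in this abstract can be sketched as follows. The mutual rank of two genes is taken here as the geometric mean of each gene's rank in the other's list of correlations sorted from highest to lowest. The function names, the toy input format, and the `cutoff` argument are assumptions for illustration. The abstract does not state the cut-off value used.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def mutual_ranks(profiles):
    """Map each gene pair to its mutual rank.

    profiles: dict of gene name -> list of expression values.
    """
    genes = list(profiles)
    pcc = {g: {h: pearson(profiles[g], profiles[h]) for h in genes if h != g}
           for g in genes}
    rank = {}
    for g in genes:
        # Rank 1 is the partner with the highest correlation to g.
        ordered = sorted(pcc[g], key=lambda h: pcc[g][h], reverse=True)
        rank[g] = {h: r for r, h in enumerate(ordered, start=1)}
    return {(g, h): math.sqrt(rank[g][h] * rank[h][g])
            for i, g in enumerate(genes) for h in genes[i + 1:]}


def network_edges(mr, cutoff):
    """Keep gene pairs whose mutual rank is at or below a chosen cutoff."""
    return [pair for pair, value in mr.items() if value <= cutoff]
```

Because it depends on ranks rather than raw correlation values, a mutual-rank cut-off selects pairs that are each other's close neighbours even when absolute PCC values differ between genes.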